Efficiency Considerations for Scalable Information Retrieval Servers

Authors

  • Ophir Frieder
  • David A. Grossman
  • Abdur Chowdhury
  • Gideon Frieder
Abstract

We review a variety of techniques to improve efficiency in information retrieval. Given the increasing volumes of data that are available electronically, understanding and using such techniques is critical. We address several efficiency concerns, but our primary focus is on index processing since it dominates the computational demands of information retrieval. Given the importance of index processing, in addition to a general overview, we include some recent index maintenance results. These results demonstrate that by delaying the updating of the index when additional documents are introduced to the collection, efficiency is improved without noticeably degrading the effectiveness of information retrieval. We conclude with an overview of parallel processing in information retrieval. Since users cannot tolerate lengthy response times, searching large text databases requires vast computational resources. Parallel processing is currently the only means to support these demands. We focus on only those approaches that are currently commercially viable.
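The delayed index update mentioned in the abstract can be illustrated with a minimal sketch. It assumes one possible reading of the idea: new documents are buffered and merged into the inverted index in batches rather than one at a time. The class name `DeferredIndex`, the `flush_threshold` parameter, and the whitespace tokenizer are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


class DeferredIndex:
    """Toy inverted index that batches updates (illustrative assumption only)."""

    def __init__(self, flush_threshold=3):
        self.postings = defaultdict(list)        # term -> sorted list of doc ids
        self.pending = []                        # (doc_id, text) pairs not yet indexed
        self.flush_threshold = flush_threshold   # hypothetical batch size
        self.flushes = 0                         # number of batch merges performed

    def add_document(self, doc_id, text):
        # Buffer the document; the index itself is not touched yet.
        self.pending.append((doc_id, text))
        if len(self.pending) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # Merge every buffered document into the index in a single pass.
        touched = set()
        for doc_id, text in self.pending:
            for term in set(text.lower().split()):
                self.postings[term].append(doc_id)
                touched.add(term)
        for term in touched:
            self.postings[term].sort()
        self.pending.clear()
        self.flushes += 1

    def search(self, term):
        # Only documents that have already been merged are visible to queries.
        return list(self.postings.get(term.lower(), []))


index = DeferredIndex(flush_threshold=2)
index.add_document(1, "parallel index processing")
before_flush = index.search("index")   # document 1 is still buffered
index.add_document(2, "index maintenance")
after_flush = index.search("index")    # both documents merged in one batch
```

In this sketch the cost of index maintenance is paid once per batch, at the price of recently added documents being temporarily invisible to queries.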

Similar resources

Improved Skips for Faster Postings List Intersection

Information retrieval can be achieved through computerized processes by generating a list of relevant responses to a query. The document processor, matching function, and query analyzer are the main components of an information retrieval system. Document retrieval systems are fundamentally based on Boolean, vector-space, probabilistic, and language models. In this paper, a new methodology for mat...
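For context on the title above, here is a minimal sketch of the textbook skip-pointer technique for intersecting two sorted postings lists. It is not the improved method proposed in that paper, whose description is truncated here. The function name `intersect_with_skips` and the square-root skip spacing are assumptions chosen for illustration, and each list is assumed to hold unique, ascending document ids.

```python
import math


def intersect_with_skips(a, b):
    """Intersect two sorted lists of unique doc ids using evenly spaced skips."""
    # Assumed heuristic: place a skip every sqrt(n) entries.
    skip_a = max(1, int(math.sqrt(len(a))))
    skip_b = max(1, int(math.sqrt(len(b))))
    i, j, result = 0, 0, []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            result.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            # Take the skip only if its target does not overshoot b[j].
            if i % skip_a == 0 and i + skip_a < len(a) and a[i + skip_a] <= b[j]:
                i += skip_a
            else:
                i += 1
        else:
            if j % skip_b == 0 and j + skip_b < len(b) and b[j + skip_b] <= a[i]:
                j += skip_b
            else:
                j += 1
    return result


postings_a = [2, 4, 8, 16, 19, 23, 28, 43]
postings_b = [1, 2, 3, 5, 8, 41, 43, 51, 60, 71]
common = intersect_with_skips(postings_a, postings_b)
```

A skip is safe because every entry jumped over is smaller than the skip target, which in turn does not exceed the current entry of the other list, so no match can be missed.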

Efficiency Considerations for Scalable Information Retrieval Servers

We overview a variety of techniques to improve efficiency in information retrieval. Given the increasing volumes of data that are available electronically, understanding and using such techniques is critical. We address several efficiency concerns, but our primary focus is on index processing since it dominates the computational demands of information retrieval. Given the importance of index proces...

Behavioral Considerations in Developing Web Information Systems: User-centered Design Agenda

The current paper explores designing a web information retrieval system regarding the searching behavior of users in real and everyday life. Designing an information system that is closely linked to human behavior is equally important for providers and the end users. From an Information Science point of view, four approaches in designing information retrieval systems were identified as system-...

Text Based Approaches for Content Based Image Retrieval in a P2P Network

The tremendous growth of digital multimedia content on the web requires scalable, efficient, and effective information retrieval mechanisms. Handling such large collections of data in a centralized way requires costly high bandwidth connectivity and powerful servers. This establishes the need of distributed architectures, such as peer-to-peer systems, that allow sharing of data management and s...

Journal title:
  • J. Digit. Inf.

Volume 1

Publication date: 2000